Distribution of Inter-Node Distances in Digital Trees
نویسندگان
چکیده
We investigate distances between pairs of nodes in digital trees (digital search trees (DST), and tries). By analytic techniques, such as the Mellin Transform and poissonization, we describe a program to determine the moments of these distances. The program is illustrated on the mean and variance. One encounters delayed Mellin transform equations, which we solve by inspection. Interestingly, the unbiased case gives a bounded variance, whereas the biased case gives a variance growing with the number of keys. It is therefore possible in the biased case to show that an appropriately normalized version of the distance converges to a limit. The complexity of moment calculation increases substantially with each higher moment; A shortcut to the limit is needed via a method that avoids the computation of all moments. Toward this end, we utilize the contraction method to show that in biased digital search trees the distribution of a suitably normalized version of the distances approaches a limit that is the fixed-point solution (in the Wasserstein space) of a distributional equation. An explicit solution to the fixed-point equation is readily demonstrated to be Gaussian.
منابع مشابه
Limit distribution of the degrees in scaled attachment random recursive trees
We study the limiting distribution of the degree of a given node in a scaled attachment random recursive tree, a generalized random recursive tree, which is introduced by Devroye et. al (2011). In a scaled attachment random recursive tree, every node $i$ is attached to the node labeled $lfloor iX_i floor$ where $X_0$, $ldots$ , $X_n$ is a sequence of i.i.d. random variables, with support in [0,...
متن کاملProbabilistic analysis of the asymmetric digital search trees
In this paper, by applying three functional operators the previous results on the (Poisson) variance of the external profile in digital search trees will be improved. We study the profile built over $n$ binary strings generated by a memoryless source with unequal probabilities of symbols and use a combinatorial approach for studying the Poissonized variance, since the probability distribution o...
متن کاملGenome analysis with inter-nucleotide distances
MOTIVATION DNA sequences can be represented by sequences of four symbols, but it is often useful to convert the symbols into real or complex numbers for further analysis. Several mapping schemes have been used in the past, but they seem unrelated to any intrinsic characteristic of DNA. The objective of this work was to find a mapping scheme directly related to DNA characteristics and that would...
متن کاملTESTING FOR “RANDOMNESS” IN SPATIAL POINT PATTERNS, USING TEST STATISTICS BASED ON ONE-DIMENSIONAL INTER-EVENT DISTANCES
To test for “randomness” in spatial point patterns, we propose two test statistics that are obtained by “reducing” two-dimensional point patterns to the one-dimensional one. Also the exact and asymptotic distribution of these statistics are drawn.
متن کاملTowards testing the "honeycomb rippling model" in cerrado.
Savannas are tropical formations in which trees and grasses coexist. According to the "honeycomb rippling model", inter-tree competition leads to an effect of trees growing and dying due to competition, which, at fine spatial scale, would resemble honeycomb rippling. The model predicts that the taller the trees, the higher the inter-tree distances and the evenness of inter-tree distances. The m...
متن کامل